DISCOVERY of LONGEST INCREASING SUBSEQUENCES and its VARIANTS using DNA OPERATIONS

Author

  • A. MURUGAN
Abstract

The Longest Increasing Subsequence (LIS) and the Common Longest Increasing Subsequence (CLIS) are important in many data mining applications. We propose algorithms to discover LIS and CLIS from varied databases. This work finds all increasing subsequences in a given database, increasing subsequences in n sliding windows, the longest increasing subsequences in one or more sequences, decreasing subsequences, and common increasing subsequences for varied window sizes. The proposed work can be applied to finding diverging patterns, constrained LIS, sequence alignment, motif discovery in genetic databases, pattern recognition, and mining emerging and contrast patterns in both scientific and commercial databases. The algorithms are implemented and tested for accuracy on both real and simulated databases. Finally, the validity of the algorithms is checked and their time complexity is analyzed.
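For orientation, the two basic tasks named in the abstract (finding an LIS, and finding an LIS inside each sliding window) can be sketched with a conventional dynamic program. This is only an illustrative baseline: it does not use DNA operations and is not the paper's algorithm, and the function names `longest_increasing_subsequence` and `lis_per_window` are hypothetical. It assumes strictly increasing subsequences and fixed-width contiguous windows.

```python
from typing import List, Sequence


def longest_increasing_subsequence(seq: Sequence[int]) -> List[int]:
    """Return one longest strictly increasing subsequence (O(n^2) DP)."""
    n = len(seq)
    if n == 0:
        return []
    length = [1] * n   # length[i]: length of the best subsequence ending at i
    prev = [-1] * n    # prev[i]: index of the previous element, or -1
    for i in range(n):
        for j in range(i):
            if seq[j] < seq[i] and length[j] + 1 > length[i]:
                length[i] = length[j] + 1
                prev[i] = j
    # Walk back from the first index that attains the maximum length.
    end = max(range(n), key=length.__getitem__)
    out = []
    while end != -1:
        out.append(seq[end])
        end = prev[end]
    return out[::-1]


def lis_per_window(seq: Sequence[int], width: int) -> List[List[int]]:
    """Return one LIS for every contiguous window of the given width."""
    return [longest_increasing_subsequence(seq[i:i + width])
            for i in range(len(seq) - width + 1)]
```

Running the same function on the negated sequence gives a longest decreasing subsequence, which is one simple way to cover the decreasing variant mentioned above.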

Similar resources

Discovering sequence motifs of different patterns parallel using DNA operations

Discovery of motifs in biological sequences and of various types of subsequences in commercial databases has varied applications and interpretations. This paper proposes a new approach to solve the Combinatorial Pattern Matching (CPM) problem, search for continuous and gapped rigid subsequences, and discover Longest Common Rigid Subsequences (LCRS) from the given sequences using DNA operations and modifie...

Subsequence Combinatorics and Applications to Microarray Production, DNA Sequencing and Chaining Algorithms

We investigate combinatorial enumeration problems related to subsequences of strings; in contrast to substrings, subsequences need not be contiguous. For a finite alphabet Σ, the following three problems are solved. (1) Number of distinct subsequences: Given a sequence s ∈ Σ^n and a nonnegative integer k ≤ n, how many distinct subsequences of length k does s contain? A previous result by Chase st...

Mining Biological Repetitive Sequences Using Support Vector Machines and Fuzzy SVM

Structural repetitive subsequences are the most important portion of biological sequences and play crucial roles in the corresponding sequence's fold and functionality. The biggest class of repetitive subsequences is "Transposable Elements", which has its own sub-classes based on the contexts' structures. Much research has been performed to critically determine the structure and function of repetitiv...

Increasing and Decreasing Subsequences and Their Variants

We survey the theory of increasing and decreasing subsequences of permutations. Enumeration problems in this area are closely related to the RSK algorithm. The asymptotic behavior of the expected value of the length is(w) of the longest increasing subsequence of a permutation w of 1, 2, ..., n was obtained by Vershik-Kerov and (almost) by Logan-Shepp. The entire limiting distribution of is(w...

Enumerating Longest Increasing Subsequences and Patience Sorting

In this paper we present three algorithms that solve three combinatorial optimization problems related to each other. One of them is the patience sorting game, invented as a practical method of sorting real decks of cards. The second problem is computing the longest monotone increasing subsequence of the given sequence of n positive integers in the range 1, ..., n. The third problem is to en...
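The link between patience sorting and the longest increasing subsequence can be illustrated with the textbook pile-based procedure: each value goes on the leftmost pile whose top is not smaller than it, and the number of piles at the end equals the LIS length. The sketch below is that standard O(n log n) technique, not the algorithms of the paper listed here; the function name `lis_length_patience` is hypothetical, and strict increase is assumed.

```python
from bisect import bisect_left
from typing import List, Sequence


def lis_length_patience(seq: Sequence[int]) -> int:
    """Length of a longest strictly increasing subsequence via patience piles."""
    pile_tops: List[int] = []  # top card of each pile, kept in sorted order
    for x in seq:
        # Leftmost pile whose top is >= x (binary search over the sorted tops).
        k = bisect_left(pile_tops, x)
        if k == len(pile_tops):
            pile_tops.append(x)   # no such pile: start a new one on the right
        else:
            pile_tops[k] = x      # place x on top of pile k
    return len(pile_tops)
```

Only the pile tops are stored, so this version reports the length alone; recovering or enumerating the subsequences themselves needs extra bookkeeping.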

Publication date: 2013